East meets West: Producing Multilingual Resources in a European Context

Authors

  • Tomaž Erjavec
  • Ann Lawson
  • Laurent Romary
Abstract

The EU concerted action TELRI has released a two-volume CD-ROM containing multilingual language resources, namely corpora, lexica, and tools for language engineering. This CD-ROM provides harmonised resources for unprecedented numbers and kinds of languages, mainly from non-EU countries, for which such resources still tend to be scarce. The first volume of the CD-ROM includes the aligned text of Plato's Republic in twenty-one languages plus other tools and resources, while the second volume contains extended results of the EU MULTEXT-East project, including the aligned and tagged novel '1984' by George Orwell and accompanying lexica in seven languages. The paper presents the CD-ROM, the methods employed in its creation and its prospective uses.

Similar resources

E-Learning Strategy for South East European University to Enable Borderless Education

In this paper we present a strategy for implementing E-Learning at South East European University. The developed strategy takes into account the University’s mission in achieving a so-called borderless education within the regional Balkans context, but also in a wider European and global context. A number of issues related to such a specific context of the University, such as its multilingual a...

MULTEXT-East Version 4: Multilingual Morphosyntactic Specifications, Lexicons and Corpora

The paper presents the fourth, “Mondilex” edition of the MULTEXT-East language resources, a multilingual dataset for language engineering research and development, focused on the morphosyntactic level of linguistic description. This standardised and linked set of resources covers a large number of mainly Central and Eastern European languages and includes the EAGLES-based morphosyntactic specif...

Tags and self-organisation: a metadata ecology for learning resources in a multilingual context

Social tags offer a novel perspective for studying learning resources, their metadata and how users interact with them. The key theme in this research is to understand the central role of social tagging for Technology Enhanced Learning (TEL), more specifically, for digital learning resources in a multilingual context. The main hypothesis is that the self-organisation aspect of a social tagging system on a...

MULTEXT-East Version 3: Multilingual Morphosyntactic Specifications, Lexicons and Corpora

The paper presents the third edition of the MULTEXT-East language resources, a multilingual dataset for language engineering research and development. This standardised and linked set of resources covers a large number of mainly Central and Eastern European languages and includes the EAGLES-based morphosyntactic specifications, defining the features that describe word-level syntactic annotation...

OrienTel: speech-based interactive communication applications for the Mediterranean and the Middle East

In this paper, we introduce a new European project named OrienTel. The aim of OrienTel is to enable the project's participants to design and develop multilingual interactive communication services for the Mediterranean and the Middle East, ranging from Morocco in the West to the Gulf states in the East, including Turkey and Cyprus. These multilingual applications will be largely speech-based an...

Publication date: 2011